CorporAl: a Method and Tool for Handling Overlapping Parallel Corpora
Similar resources
CorporAl: a Method and Tool for Handling Overlapping Parallel Corpora
This work introduces a method and tool for handling overlapping parallel corpora – i.e. corpora that are based on the same source material. The method is insensitive to minor changes in the text, to different segmentation levels of the corpora, and to material omitted from either corpus. The aim is to detect matching sentence pairs and either produce combinations of the overlapping corpora or compare ...
Experiments on Processing Overlapping Parallel Corpora
The number and sizes of parallel corpora keep growing, which makes it necessary to have automatic methods of processing them: combining, checking and improving corpora quality, etc. We here introduce a method which enables performing many of these by exploiting overlapping parallel corpora. The method finds the correspondence between sentence pairs in two corpora: first the corresponding langua...
PolyphraZ: A Tool For The Management Of Parallel Corpora
The PolyphraZ tool is being developed in the framework of the TraCorpEx project (Translation of Corpora of Examples), to manage parallel multilingual corpora through the web. Corpus files (monolingual or multilingual) are firstly converted to a standard coding (CXM.dtd, UTF8). Then, they are assembled (CPXM.dtd) to visualize them in parallel through the web. In a third stage, they are put in a ...
A language-independent method for the alignement of parallel corpora
The automatic alignment of parallel corpora is a very rich source of information for automatic translation, multilingual document indexing, information retrieval, etc. The rapid growth of the use of “minority” languages in online documents makes it necessary to develop methods that can easily adapt to any language. We present an evolution over previous works, notably by Church and Gale [1], t...
A Parallel Overlapping Time-Domain Decomposition Method for ODEs
We introduce an overlapping time-domain decomposition for linear initial-value problems which gives rise to an efficient solution method for parallel computers without resorting to the frequency domain. This parallel method exploits the fact that homogeneous initial-value problems can be integrated much faster than inhomogeneous problems by using an efficient Arnoldi approximation for ...
Journal
Journal title: The Prague Bulletin of Mathematical Linguistics
Year: 2010
ISSN: 1804-0462,0032-6585
DOI: 10.2478/v10108-010-0021-7